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We present a nonparametric way to retrieve an additive system of differential equations in embed- 
ding space from a single time series. These equations can be treated with dynamical systems theory 
and allow for long term predictions. We apply our method to a modified chaotic Chua oscillator in 
order to demonstrate its potential. 
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Casting physical observations into mathematical equa- 
tions is one of the fundamental tasks to understand and 
predict dynamical systems. Basically, there are two com- 
plementary approaches to accomplish this task: theoret- 
ically by convenient considerations and empirically by 
data analysis. Both approaches are essential for mod- 
ern modeling strategies. If For many systems, the dy- 
namics is not directly accessible to theoretical consid- 
erations; then an appropriate data analysis is essential. 
This problem is very general, one can find it in classical 
fields of physics, e.g. classical mechanics, fluid dynamics, 
solid state physics, statistical physics as well as in more 
interdisciplinary fields, e.g., physiology, earth sciences, 
economics or biological systems. In this paper the data 
analysis issue is addressed: we determine an analytically 
treatable set of additive equations in embedding space by 
the method of nonparametric embedding. This approach 
is a priori parameter-free; but subsequent parameteriza- 
tion can be helpful for analytical representation of the 
involved functions. 

Often, the measurement of a complex system does not 
yield the whole set of state variables. The missing dy- 
namics can be accessed by the embedding technique 
Given the measurement of a subset of variables, one can 
infer the missing information by an embedding map, e.g., 
by using the time-delayed variables or their derivatives. 
This has been proven rigorously for a wide class of sys- 
tems It is, however, not known how the equations 
of the dynamical system in embedding space are struc- 
tured. In this communication, we propose a technique to 
find a set of equations which allows a reproduction of the 
dynamics in phase space for the class of additive systems. 

There are several excellent reviews about embedding 
QilEQ; therefore, we only repeat some basic facts. We 
consider a system governed by a set of ordinary differen- 
tial equations: 

S!=F{x), (1) 

where x e R", : R" 1^ R". This set of equations de- 
fines a flow. Ft, in phase space. We assume that there 
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exists an attractor A C R" with the box-counting dimen- 
sion d < n. In 2] it has been shown that almost every 
smooth map vj/ : R" M™, m > 2d, is an embedding, i.e. 
a smooth diffeomorphism from A onto its image 'i'{A). 
The condition m > 2c? is sufficient, therefore cases with 
d < m < 2d can occur. 

Due to differentiability, the dynamics of — '^{x{t)) 
obeys an ordinary differential equation in embedding 
space: 

m , (2) 

with ,f G R™, $ : M™ 1-^ M". In this article, we focus on 
additive models for the components <I>i and show how to 
retrieve them from data. 

One standard way of embedding is the use of the delay- 
coordinate map H{f,T), with the smooth observation 
function, / : R" t-^ R, and t, the time delay, some real 
number 2] : 

H(f, t) = (/, • • ■ , f{F.(,n-l)r)) ■ (3) 

As an example, consider the particular case of /(a?) = 
xi. Identifying the above embedding map ^ with H, 
the coordinates in embedding space are ^i{t) = f{x) = 
xi{t),^2{t) = f{F-r{x)) =xi{t- r), etc.. 

In our analysis, we perform numerical simulations for 
some model systems to obtain time series of various vari- 
ables. We then discard all but one variable to embed 
the dynamical system (|5J using the delay map. To avoid 
confusion, we will refer to dynamics from Eq. as origi- 
nal. For the counterpart, Eq. (|2Jl, to be estimated by non- 
parametric regression, we will use the term reconstructed. 
If the embedding map ^E" is concerned, embedded will be 
used - the latter meaning that a time series from the orig- 
inal system is used, i.e. without knowing the dynamical 
system 

To find a dynamical system in embedding space, sev- 
eral approaches exist, e.g., local linear fits and paramet- 
ric procedures as polynomial fits, radial basis functions 
or neural networks (cf. Local fitting is a general 

concept, but the results are neither easy to access ana- 
lytically nor to visualize due to the high dimensionality. 
Polynomial ansatzes tend to involve too many terms for 
a clear identification of a mathematical or physical struc- 
ture; for neural networks a physical interpretation is very 
hard. 
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Now we describe our procedure in more detail: Consid- 
ering each temporal measurement as a realization of the 
flow, one obtains as a best estimator of the components 
of Eq. Q in the least-square sense 



(4) 



with £'[-|-] the conditional expectation value (CEV) op- 
erator. It is a very hard task to extract analytical mod- 
els from Eq. ; visualization is obviously impossible for 
m > 2. 

To tackle this problem, we require the rhs of Eq. (0) 
to be an additive model: 



(5) 



This is a subset of the class of models considered by Kol- 
mogorov 0: he showed rigorously that it is possible to 
represent any continuous function of a set of m variables 
as a 2m -f 1-fold superposition of m functions of one argu- 
ment. Below, we show that despite the less general for- 
mulation it is possible to reconstruct a chaotic dynamical 
system. Our model (jSj is, however, still in a wider model 
class than in parametric methods, because we do not rely 
on a given set of basis functions. After having finally es- 
timated the components 4>ij, we can easily visualize the 
functions and try analytical formulae. 

It is worth noting the geometrical aspect of our ap- 
proach: Eq. ^ defines a differentiable manifold approx- 
imated by the sum of the functions 0y , cf. Eq. |SJ . This 
is possible within a certain scatter, which is quantified be- 
low by the correlation. If the manifold is found exactly 
by the model, the correlation is 1. Dynamical and topo- 
logical properties of the original system are mirrored in 
embedding space. Long-term predictions of the dynam- 
ics are thus possible on the basis of the obtained model 
if the correlation is close to 1, which is a very strong 
advantage. 

The optimal estimate for the is calculated by the 
backfitting algorithm 0. It works by alternately apply- 
ing the CEV operator to projections of <I>i on the coor- 



dinates: 4>ij{S,j) = -EiCi - Y,k=/.] 4>ik I 01' ^^'^ is proven 
to converge to the global optimum in the least-square 
(Eq. (O) or correlation (Eq. iQ) sense. For the applica- 
tion to spatio-temporal data analysis, see 0,I3- We cal- 
culate the CEV by smoothing splines, which are optimal 
for nonparametric regression j^] , due to their smoothness 
and differentiability properties. It is important to note 
that the parameters used by splines or other estimators 
are method-inherent and not prescribed by a preselected 
model; in this sense the model is parameter free. 

As an overall quality measure, the least-square error 
can be used 



E 



fc=i 
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The backfitting method, however, is formulated as opti- 
mal in the sense of correlation, i.e. the natural measure 
is the correlation coefficient C^o between rhs and Ihs in 
Eq. (|3J). The correlation coefficient Cy, given in Eq. ||7J), 
indicates its individual weight for the model: 



Cio — C 



>ik 



Cij — C 



>ik 



(7) 

We will use Cm as a quantitative measure, but not x^- 
Putting $i = V, X^feLi 4>ik = X, we give the relation 
between both measures: 2 ■ ■ ^VAR{X) ■ VAR{Y) = 
VAR{X) + VAR{Y) + [E{X - Y)]^ ~ xf, with VAR the 
variance. A correlation close to 1 means the manifold 
described by Eq. Q is approximated very well, lower 
correlations indicate scatter of data points around the 
manifold. In the case of experimental data, measurement 
noise can produce some additional scatter. 

In the following, the procedure is illustrated by the 
example of a modified Chua circuit with a third order 
nonlinearity. The basic equations read: 



xi — a{mQXi — 1/3 mixi + X2) , 

X2 ^ Xi - X2 + X3 , X3 = -5X2 • 



(8) 



Written as an additive model ISJ these equations read 

Xl = fl,l{xi) + /l, 2(2:2) , X2 = f2,l{xi) + /2, 2(2^2) + 

/2, 3(2:3) , 2:3 = /3, 2(2:2) , with the linear functions /i^2, 
/2,ij /3,2 and /i^i a third order polynomial. 

We integrate the system © with a = 18, 6 = 33, itiq = 
0.2, mi — 0.01 numerically by a Runge-Kutta algorithm 
of 4th order. The time series of the first component is 
used for embedding. Results do not change for other 
components. We first discuss embedding dimension m = 
3. From the time series x{t), the points — x(t — T{i — 
1)), ^1 = x{t), ^2 = x{t — r), ^3 = x{t — 2t) are used. The 
derivative is taken directly from the integration, this is 
more exact than the estimate by finite differences. The 
nonparametric regression yields the functions (/jy for the 
resulting dynamical system . 

First, we present results for the specific delay, r = 0.2, 
we study below the dependence of the results on the delay 
time. For r = 0.2, the embedded and the reconstructed 
attractor are shown together in Fig. ^ With respect to 
the data analysis, we want to quantify i) the quality of 
the regression, ii) the importance of the functions (t)ij, 
and iii) the functions themselves. 

i) The quality of the regression is given by the corre- 
lation Cio, cf. We find in our case do — 0.992, 
C20 = 0.999, C30 = 0.995, such that the modeling error 
is very small. 

ii) The importance of functions is found by the coeffi- 
cients C^, defined in Eq. JT)) (i,j = 1,2,3). We find 
Cij > 0.99 V«,j, consequently every function is substan- 
tial here (cf. Fig. 121). Given that we analyzed 50,000 data 
points, the refer to a very high correlation. There- 
fore we infer a property of the embedding transformation: 
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FIG. 1: Embedding of the first component of the system © 
with delay r — 0.2. The system, embedded with m = 3 
(bottom) is shown together with the reconstructed trajectory 
from the integration of the systems obtained by nonparamet- 
ric regression for embedding dimension 3 (middle) , and 4 
(top), respectively. An offset is added to avoid overlap of the 
attractors. 
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FIG. 2: Reconstructed functions i = 1, 2, 3 in embedding 
space, a) <j)-ij, b) 02j, c) if^j, for j — 1 (solid line), j — 2 
(dotted line), and j = 3 (dashed line). All functions are 
important with Cn = 0.999, C12 = 0.99 , C13 = 0.998 , 
C21 = 0.999 , C22 = 0.991 , C23 = 0.999 , C31 = 0.997 , 
C32 = 0.999 , C33 = 0.999. 



each of the embedding space coordinates £,i contains in- 
formation necessary for the dynamics. 

iii) The nine functions (f>ij, displayed in Fig|21are the 
most important result for an application. All functions 
are important and nonlinear, to a good approximation 
of cubic order; only appears to be a piecewise linear 
function. The quantitative comparison of the dynamics 
of the reconstructed and the original system is done by 
i) calculation of the fixed points, ii) their stability and 
iii) the Lyapunov exponents (LE's) of the reconstructed 
system. These quantities have to coincide with the ones 
of the original system. 

i) Fixed points. We solved Y^'^j=i4'ij = {i — 1,2,3) 
numerically with the functions from the output of the 
analysis. The three fixed points of the embedded system 
are (-7.75, -7.75, -7.75), (0, 0, 0), (7.75, 7.75, 7.75) with 
an accuracy of 10^"^ . In the system, reconstructed with 
T = 0.2, the fixed points are = (-7.76, -7.55, -7.66), 
{* = (7.75, 7.52, 7.68), and = (0, 0, 0) with an error of 



FIG. 3: Lyapunov exponents for original and reconstructed 
system for embedding dimension m=3 (a) and m=4 (b). In- 
creasing m results in a larger window in the delay time for 
which the system is reconstructed, i.e. the LE's coincide well. 
The thin dotted lines indicate the LE's for the original system, 
Ai = 0.432, A2 = 0, A3 = —6.31, the straight, dash-dotted and 
dashed lines the correspondent ones for the reconstructed sys- 
tem. 



less than 1%. 

ii) Stability analysis. The eigenvalues, corresponding 
to the above fixed points are 71 = (—7.68, 0.47 -I- 
i 4.45, 0.47 - i 4.45), 72 = (-7.62,0.58 -^^4.55,0.59 - 
i4.55), 73 = (5.09,-1.16 -I- ? 4.56, -1.16 - i 4.56) to 
be compared with the ones of the original Chua sys- 
tem: 7o.i = (-8.76,0.28 -I- i 5.20, 0.28 - i 5.20), 70,2 = 
(-8.76, 6.28 + i 5.20, 0.28 - i 5.20), 7^,3 = (5.03, -1.21 -f 
i 4.71, —1.21 — i 4.71). For the embedded attractor, there 
is nothing to calculate due to missing equations. Fur- 
thermore, the embedding conserves dynamical proper- 
ties. The contraction rate from 71^2 is found within 15%, 
the expansion rate from 73 is found within 1%, the imag- 
inary parts coincide within 15 %. 

iii) Lyapunov exponents and dependence on the delay. 
We calculated the Lyapunov exponents of the recon- 
structed system for < t < 1. For most of the delays 
no useful reconstruction is possible, however in the 
window 0.14 < r < 0.28 the LE's are very close to the 
original ones (Fig. |3^). By eye, it is hard to recognize 
which attractor is reconstructed or embedded (Fig. 
With this study, we have determined the delay which 
is optimal in the sense of nonparametric embedding. 
Usually, the delay is chosen such that the information 
content in the delay coordinate vector is maximized. 
To do so one determines the minimum of the mutual 
information or the first zero of autocorrelation or similar 
measures It turns out that these approaches do not 
yield a delay different from ours. 

If the embedding dimension is increased, one expects 
a good reconstruction in a larger delay-time window, be- 
cause more information is used. This is confirmed by the 
calculation of the LE's with m — 4, (Fig. |2b), where a 
good reconstruction is found for 0.08 < r < 0.36. The 
attractor for r — 0.2 is shown for comparison(Fig. ^ 
top). 

At m = 4, there is a breakdown of the reconstruction 



4 




FIG. 4: Reconstruction for the Lorenz system (m = 3). A 
part of the dynamics is not reconstructed, but may be recov- 
ered with higher dimensional embedding. 

for 0.22 < T < 0.26, whereas m = 3 yields good results 
(Fig.|21l. This is unexpected and a conclusive explanation 
requires further investigation. 

A particularity of the modified Chua system is its 
additive structure. Next, we check whether a success- 
ful reconstruction can be found for dynamical systems 
with multiplicative terms, too, such as the Lorenz or the 
Rossler system. For both, we find a worse capability of 
our method to reconstruct the dynamics. For the Lorenz 
system {a = 10, p = 28, f} = 8/3), one of the best results 
appears for r = 0.09, the corresponding reconstructed 
attractor is shown in Fig. 0] Clearly, some of the dy- 
namics is lost, nevertheless a chaotic motion about the 
correct fixed points is found. The largest LE is found 
to Xrec = 0.08 to be compared with the original one, 
Ao = 0.905. Tests with to = 4, and to = 5 did not yield 
significant improvement. The reason lies probably in the 
topology of the attractor which cannot be produced by a 
purely additive model of reasonably low dimensionality. 
This is a limit of the additive modeling approach. 

We have successfully reconstructed a dynamical sys- 
tem by a set of ordinary differential equations in embed- 
ding space. We have considered additive models only, 
and have used as a typical chaotic system a modified 
Chua oscillator for illustration. The resulting equations 
can be analyzed by dynamical systems theory: we have 
investigated the fixed point structure, linear stability, 
and Lyapunov exponents and have found that these dy- 
namical characteristics quantitatively coincide with the 
ones of the original system. By studying the depen- 
dence of the results on the delay-time, we could iden- 
tify the window in which our method works very well. 
Higher embedding dimensions enlarge this window, over- 
determination can, however, let the description break 
down. For non-additive systems, our analysis works qual- 
itatively, a quantitative comparison is in general not pos- 
sible, although the result indicates which terms can be 
important in a more general model. 



In the method, the statistical backfitting algorithm is 
used for an estimation of the CEV; the result is a set of 
optimal functions (pij . It is inherently insensitive against 
noise |^ and can be generalized in many ways. The 
results are functions of one variable and can be visual- 
ized and approximated by analytical formulae after the 
backfitting procedure. This yields an important advan- 
tage: when fitting polynomials or other basis systems one 
chooses these functions beforehand, this is not needed in 
the nonparametric approach, and the result is still inter- 
pretable. From a practical point of view, the input data 
are crucial for a good estimation on a connected region 
and estimation of derivatives. Asymptotics and gaps due 
to missing data have to be treated with great care or in- 
stead of derivatives one might rather use a mapping ap- 
proach 10] . Decision on additivity of a model works by 
statistical measures (correlations or least-squares error), 
whereas dynamical measures, as Lyapunov Exponents, 
indicate how well the dynamics is reproduced. 

From a theoretical point of view, we formulate the 
following, general question: given a dynamical system 
(additive/multiplicative structure), which topology of a 
corresponding attractor is possible? Vice versa, given 
a topology and dynamics, which is the structure of the 
underlying dynamical system? We have treated a given 
topology (of the Chua, Lorenz and Rossler system); tak- 
ing into account the embedding theorem, the problem is 
transferred to embedding space. There, we have recon- 
structed a dynamical system of additive structure, suc- 
cessfully for the Chua system, less convincingly for the 
Lorenz and Rossler system. This suggests that the addi- 
tive structure is kept. Mathematically, related questions 
have been treated in Q . A key role is played by the non- 
linear embedding transformation which can distort the 
system considerably. To our knowledge the above ques- 
tions are open and touch the core of modern theory of 
dynamical systems. 

Current and future activities focus on generalization 
to reconstruct mixed additive/multiplicative models fol- 
lowing Kolmogorovs ideas, especially for real data. One 
goal is to follow the way from the general model (0) to a 
purely additive model Q. Finding the model which in- 
volves the least possible multiplicative and additive terms 
yields a considerable ease to analyze the systems. With 
an analytical expression a detailed analysis and long-term 
prediction of a (chaotic) orbit is possible, this is an un- 
precedented result. Applications for our method reach 
from geophysics and climatology to biology and medicine, 
where the prediction of, e.g., climate change or illness de- 
tection are topics of superior interest. 
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